Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 11127 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.0 MiB |
| Average record size in memory | 96.0 B |
Variable types
| Numeric | 6 |
|---|---|
| Text | 5 |
| Categorical | 1 |
Alerts
| ratings_count is highly overall correlated with text_reviews_count | High correlation |
|---|---|
| text_reviews_count is highly overall correlated with ratings_count | High correlation |
| language_code is highly imbalanced (76.6%) | Imbalance |
| isbn13 is highly skewed (γ1 = -21.07028799) | Skewed |
| bookID has unique values | Unique |
| isbn has unique values | Unique |
| text_reviews_count has 625 (5.6%) zeros | Zeros |
Reproduction
| Analysis started | 2024-03-03 14:58:51.657057 |
|---|---|
| Analysis finished | 2024-03-03 14:58:58.137990 |
| Duration | 6.48 seconds |
| Software version | ydata-profiling v4.6.5 |
| Download configuration | config.json |
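The Duration row can be re-derived from the two timestamps in the table above. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table above.
started = datetime.fromisoformat("2024-03-03 14:58:51.657057")
finished = datetime.fromisoformat("2024-03-03 14:58:58.137990")

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```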
bookID
Real number (ℝ)
UNIQUE 
| Distinct | 11127 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 21310.939 |
| Minimum | 1 |
| Maximum | 45641 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1800.3 |
| Q1 | 10287 |
| median | 20287 |
| Q3 | 32104.5 |
| 95-th percentile | 43066.5 |
| Maximum | 45641 |
| Range | 45640 |
| Interquartile range (IQR) | 21817.5 |
Descriptive statistics
| Standard deviation | 13093.358 |
|---|---|
| Coefficient of variation (CV) | 0.61439611 |
| Kurtosis | -1.1463568 |
| Mean | 21310.939 |
| Median Absolute Deviation (MAD) | 10879 |
| Skewness | 0.14405166 |
| Sum | 2.3712682 × 10⁸ |
| Variance | 1.7143602 × 10⁸ |
| Monotonicity | Not monotonic |
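Several of the bookID figures above are derived from others, so they can be cross-checked with simple arithmetic. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. Because the inputs are rounded as printed, the results agree with the report only to within rounding.

```python
# Values copied from the bookID tables above (rounded as reported).
minimum, maximum = 1, 45641
q1, q3 = 10287, 32104.5
mean, std = 21310.939, 13093.358

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 6), f"{variance:.4e}")
```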
| Value | Count | Frequency (%) |
| 34889 | 1 | < 0.1% |
| 28532 | 1 | < 0.1% |
| 28510 | 1 | < 0.1% |
| 28511 | 1 | < 0.1% |
| 28514 | 1 | < 0.1% |
| 28522 | 1 | < 0.1% |
| 28524 | 1 | < 0.1% |
| 28529 | 1 | < 0.1% |
| 28530 | 1 | < 0.1% |
| 28531 | 1 | < 0.1% |
| Other values (11117) | 11117 | 99.9% |
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 12 | 1 | < 0.1% |
| 13 | 1 | < 0.1% |
| 14 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 45641 | 1 | < 0.1% |
| 45639 | 1 | < 0.1% |
| 45634 | 1 | < 0.1% |
| 45633 | 1 | < 0.1% |
| 45631 | 1 | < 0.1% |
| 45630 | 1 | < 0.1% |
| 45626 | 1 | < 0.1% |
| 45625 | 1 | < 0.1% |
| 45623 | 1 | < 0.1% |
| 45617 | 1 | < 0.1% |
title
Text
| Distinct | 10352 |
|---|---|
| Distinct (%) | 93.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.1 KiB |
Length
| Max length | 254 |
|---|---|
| Median length | 141 |
| Mean length | 35.749348 |
| Min length | 2 |
Characters and Unicode
| Total characters | 397783 |
|---|---|
| Distinct characters | 296 |
| Distinct categories | 17 |
| Distinct scripts | 8 |
| Distinct blocks | 9 |
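The length and character figures for title can be cross-checked against each other. A minimal sketch: the mean length should equal total characters divided by the number of observations, and a given minimum and median imply a lower bound on the mean. The helper names are illustrative and not part of ydata-profiling. With the values reported here, the bound implied by a median of 141 is well above the reported mean, so the median may be worth re-deriving from the raw column.

```python
def mean_length(total_characters: int, n_rows: int) -> float:
    """Mean string length = total characters / number of rows."""
    return total_characters / n_rows


def min_possible_mean(n_rows: int, minimum: float, median: float) -> float:
    """Smallest mean compatible with a given minimum and median.

    At least ceil(n / 2) values are >= the median, and every other
    value is >= the minimum.
    """
    at_or_above_median = n_rows - n_rows // 2  # ceil(n / 2)
    remaining = n_rows // 2
    return (at_or_above_median * median + remaining * minimum) / n_rows


# Figures copied from the title tables above.
title_mean = mean_length(397783, 11127)
lower_bound = min_possible_mean(11127, minimum=2, median=141)
print(round(title_mean, 6), round(lower_bound, 3))
```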
Unique
| Unique | 9865 |
|---|---|
| Unique (%) | 88.7% |
Sample
| 1st row | Brown's Star Atlas: Showing All The Bright Stars With Full Instructions How To Find And Use Them For Navigational Purposes And Department Of Trade Examinations. |
|---|---|
| 2nd row | The Tolkien Fan's Medieval Reader |
| 3rd row | Streetcar Suburbs: The Process of Growth in Boston 1870-1900 |
| 4th row | Patriots (The Coming Collapse) |
| 5th row | Harry Potter and the Half-Blood Prince (Harry Potter #6) |
| Value | Count | Frequency (%) |
| the | 6692 | 10.1% |
| of | 3336 | 5.0% |
| and | 1653 | 2.5% |
| a | 1335 | 2.0% |
| 1 | 796 | 1.2% |
| in | 778 | 1.2% |
| to | 698 | 1.1% |
|  | 588 | 0.9% |
| 2 | 519 | 0.8% |
| 3 | 399 | 0.6% |
| Other values (12076) | 49535 | 74.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 58892 | 14.8% |
| e | 36631 | 9.2% |
| o | 23592 | 5.9% |
| a | 22308 | 5.6% |
| i | 20615 | 5.2% |
| r | 20216 | 5.1% |
| n | 20032 | 5.0% |
| t | 19155 | 4.8% |
| s | 16700 | 4.2% |
| h | 13698 | 3.4% |
| Other values (286) | 145944 | 36.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 265634 | 66.8% |
| Space Separator | 58892 | 14.8% |
| Uppercase Letter | 52413 | 13.2% |
| Other Punctuation | 8579 | 2.2% |
| Decimal Number | 5487 | 1.4% |
| Close Punctuation | 2765 | 0.7% |
| Open Punctuation | 2764 | 0.7% |
| Dash Punctuation | 808 | 0.2% |
| Other Letter | 373 | 0.1% |
| Math Symbol | 27 | < 0.1% |
| Other values (7) | 41 | < 0.1% |
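The category names in the table above are Unicode general categories. The sketch below shows how a comparable breakdown could be obtained for one title, using Python's standard `unicodedata` module as a stand-in; it is an illustration, not the profiler's own implementation. `unicodedata.category` returns two-letter codes such as `Lu` (Uppercase Letter), `Ll` (Lowercase Letter), `Zs` (Space Separator), and `Pd` (Dash Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (two-letter code)."""
    return Counter(unicodedata.category(ch) for ch in text)


# Fifth sample row of the title column.
title = "Harry Potter and the Half-Blood Prince (Harry Potter #6)"
counts = category_counts(title)

for code, n in counts.most_common():
    print(code, n)
```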
Most frequent character per category
Other Letter
| Value | Count | Frequency (%) |
| の | 16 | 4.3% |
| 夜 | 13 | 3.5% |
| 犬 | 13 | 3.5% |
| 師 | 13 | 3.5% |
| 術 | 13 | 3.5% |
| 金 | 13 | 3.5% |
| 鋼 | 13 | 3.5% |
| 叉 | 13 | 3.5% |
| 碁 | 11 | 2.9% |
| ヒ | 11 | 2.9% |
| Other values (144) | 244 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 36631 | |
| o | 23592 | 8.9% |
| a | 22308 | 8.4% |
| i | 20615 | 7.8% |
| r | 20216 | 7.6% |
| n | 20032 | 7.5% |
| t | 19155 | 7.2% |
| s | 16700 | 6.3% |
| h | 13698 | 5.2% |
| l | 12851 | 4.8% |
| Other values (48) | 59836 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 7023 | 13.4% |
| S | 4510 | 8.6% |
| A | 3662 | 7.0% |
| C | 3562 | 6.8% |
| M | 3165 | 6.0% |
| B | 2677 | 5.1% |
| W | 2612 | 5.0% |
| P | 2599 | 5.0% |
| L | 2507 | 4.8% |
| D | 2490 | 4.8% |
| Other values (23) | 17606 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 3025 | |
| # | 2431 | |
| ' | 1397 | |
| . | 709 | 8.3% |
| / | 414 | 4.8% |
| & | 258 | 3.0% |
| ! | 135 | 1.6% |
| ? | 70 | 0.8% |
| ; | 59 | 0.7% |
| " | 50 | 0.6% |
| Other values (8) | 31 | 0.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1724 | |
| 2 | 855 | |
| 3 | 645 | 11.8% |
| 4 | 415 | 7.6% |
| 0 | 396 | 7.2% |
| 9 | 371 | 6.8% |
| 5 | 347 | 6.3% |
| 6 | 282 | 5.1% |
| 8 | 233 | 4.2% |
| 7 | 219 | 4.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 773 | |
| – | 18 | 2.2% |
| — | 15 | 1.9% |
| ― | 2 | 0.2% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ́ | 3 | |
| ̈ | 2 | |
| ̌ | 1 | 16.7% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2756 | |
| ] | 9 | 0.3% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2755 | |
| [ | 9 | 0.3% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 18 | |
| = | 9 |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 12 | |
| ” | 1 | 7.7% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 11 | |
| ² | 1 | 8.3% |
Initial Punctuation
| Value | Count | Frequency (%) |
| ‘ | 1 | |
| “ | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 58892 | 100.0% |
Modifier Letter
| Value | Count | Frequency (%) |
| ー | 3 |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 3 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 318031 | 80.0% |
| Common | 79357 | 19.9% |
| Han | 232 | 0.1% |
| Katakana | 72 | < 0.1% |
| Hiragana | 61 | < 0.1% |
| Cyrillic | 16 | < 0.1% |
| Arabic | 8 | < 0.1% |
| Inherited | 6 | < 0.1% |
Most frequent character per script
Han
| Value | Count | Frequency (%) |
| 夜 | 13 | 5.6% |
| 犬 | 13 | 5.6% |
| 師 | 13 | 5.6% |
| 術 | 13 | 5.6% |
| 金 | 13 | 5.6% |
| 鋼 | 13 | 5.6% |
| 叉 | 13 | 5.6% |
| 碁 | 11 | 4.7% |
| 之 | 8 | 3.4% |
| 鍊 | 8 | 3.4% |
| Other values (90) | 114 |
Latin
| Value | Count | Frequency (%) |
| e | 36631 | 11.5% |
| o | 23592 | 7.4% |
| a | 22308 | 7.0% |
| i | 20615 | 6.5% |
| r | 20216 | 6.4% |
| n | 20032 | 6.3% |
| t | 19155 | 6.0% |
| s | 16700 | 5.3% |
| h | 13698 | 4.3% |
| l | 12851 | 4.0% |
| Other values (73) | 112233 |
Common
| Value | Count | Frequency (%) |
| (space) | 58892 | 74.2% |
| : | 3025 | 3.8% |
| ) | 2756 | 3.5% |
| ( | 2755 | 3.5% |
| # | 2431 | 3.1% |
| 1 | 1724 | 2.2% |
| ' | 1397 | 1.8% |
| 2 | 855 | 1.1% |
| - | 773 | 1.0% |
| . | 709 | 0.9% |
| Other values (38) | 4040 | 5.1% |
Hiragana
| Value | Count | Frequency (%) |
| の | 16 | |
| ん | 5 | 8.2% |
| か | 3 | 4.9% |
| る | 3 | 4.9% |
| て | 3 | 4.9% |
| き | 3 | 4.9% |
| た | 3 | 4.9% |
| ら | 3 | 4.9% |
| な | 2 | 3.3% |
| ぜ | 2 | 3.3% |
| Other values (14) | 18 |
Katakana
| Value | Count | Frequency (%) |
| ヒ | 11 | |
| ル | 11 | |
| カ | 11 | |
| ツ | 5 | 6.9% |
| サ | 5 | 6.9% |
| バ | 5 | 6.9% |
| リ | 2 | 2.8% |
| ト | 2 | 2.8% |
| ス | 2 | 2.8% |
| ャ | 2 | 2.8% |
| Other values (14) | 16 |
Cyrillic
| Value | Count | Frequency (%) |
| а | 4 | |
| р | 3 | |
| М | 2 | |
| т | 2 | |
| и | 2 | |
| с | 1 | 6.2% |
| е | 1 | 6.2% |
| г | 1 | 6.2% |
Arabic
| Value | Count | Frequency (%) |
| م | 2 | |
| ل | 2 | |
| ح | 1 | |
| ا | 1 | |
| ن | 1 | |
| د | 1 |
Inherited
| Value | Count | Frequency (%) |
| ́ | 3 | |
| ̈ | 2 | |
| ̌ | 1 | 16.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 397016 | 99.8% |
| None | 319 | 0.1% |
| CJK | 232 | 0.1% |
| Katakana | 75 | < 0.1% |
| Hiragana | 61 | < 0.1% |
| Punctuation | 50 | < 0.1% |
| Cyrillic | 16 | < 0.1% |
| Arabic | 8 | < 0.1% |
| Diacriticals | 6 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 58892 | 14.8% |
| e | 36631 | 9.2% |
| o | 23592 | 5.9% |
| a | 22308 | 5.6% |
| i | 20615 | 5.2% |
| r | 20216 | 5.1% |
| n | 20032 | 5.0% |
| t | 19155 | 4.8% |
| s | 16700 | 4.2% |
| h | 13698 | 3.5% |
| Other values (75) | 145177 |
None
| Value | Count | Frequency (%) |
| é | 62 | |
| á | 35 | |
| ó | 31 | 9.7% |
| í | 28 | 8.8% |
| ä | 19 | 6.0% |
| ü | 17 | 5.3% |
| ñ | 14 | 4.4% |
| ½ | 11 | 3.4% |
| 、 | 11 | 3.4% |
| è | 11 | 3.4% |
| Other values (28) | 80 |
Punctuation
| Value | Count | Frequency (%) |
| – | 18 | |
| — | 15 | |
| ’ | 12 | |
| ― | 2 | 4.0% |
| ‘ | 1 | 2.0% |
| ” | 1 | 2.0% |
| “ | 1 | 2.0% |
Hiragana
| Value | Count | Frequency (%) |
| の | 16 | |
| ん | 5 | 8.2% |
| か | 3 | 4.9% |
| る | 3 | 4.9% |
| て | 3 | 4.9% |
| き | 3 | 4.9% |
| た | 3 | 4.9% |
| ら | 3 | 4.9% |
| な | 2 | 3.3% |
| ぜ | 2 | 3.3% |
| Other values (14) | 18 |
CJK
| Value | Count | Frequency (%) |
| 夜 | 13 | 5.6% |
| 犬 | 13 | 5.6% |
| 師 | 13 | 5.6% |
| 術 | 13 | 5.6% |
| 金 | 13 | 5.6% |
| 鋼 | 13 | 5.6% |
| 叉 | 13 | 5.6% |
| 碁 | 11 | 4.7% |
| 之 | 8 | 3.4% |
| 鍊 | 8 | 3.4% |
| Other values (90) | 114 |
Katakana
| Value | Count | Frequency (%) |
| ヒ | 11 | |
| ル | 11 | |
| カ | 11 | |
| ツ | 5 | 6.7% |
| サ | 5 | 6.7% |
| バ | 5 | 6.7% |
| ー | 3 | 4.0% |
| リ | 2 | 2.7% |
| ト | 2 | 2.7% |
| ス | 2 | 2.7% |
| Other values (15) | 18 |
Cyrillic
| Value | Count | Frequency (%) |
| а | 4 | |
| р | 3 | |
| М | 2 | |
| т | 2 | |
| и | 2 | |
| с | 1 | 6.2% |
| е | 1 | 6.2% |
| г | 1 | 6.2% |
Diacriticals
| Value | Count | Frequency (%) |
| ́ | 3 | |
| ̈ | 2 | |
| ̌ | 1 | 16.7% |
Arabic
| Value | Count | Frequency (%) |
| م | 2 | |
| ل | 2 | |
| ح | 1 | |
| ا | 1 | |
| ن | 1 | |
| د | 1 |
authors
Text
| Distinct | 6643 |
|---|---|
| Distinct (%) | 59.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.1 KiB |
Length
| Max length | 750 |
|---|---|
| Median length | 372 |
| Mean length | 24.724005 |
| Min length | 3 |
Characters and Unicode
| Total characters | 275104 |
|---|---|
| Distinct characters | 267 |
| Distinct categories | 11 |
| Distinct scripts | 8 |
| Distinct blocks | 8 |
Unique
| Unique | 5282 |
|---|---|
| Unique (%) | 47.5% |
Sample
| 1st row | Brown Son & Ferguson |
|---|---|
| 2nd row | David E. Smith (Turgon of TheOneRing.net one of the founding members of this Tolkien website)/Verlyn Flieger/Turgon (=David E. Smith) |
| 3rd row | Sam Bass Warner Jr./Sam B. Warner |
| 4th row | James Wesley Rawles |
| 5th row | J.K. Rowling/Mary GrandPré |
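The sample rows above suggest that the authors field packs several contributors into one string separated by `/`. A minimal sketch of splitting such a field is shown below. It assumes the slash is always a delimiter, so a name that itself contains a slash would be split incorrectly. `split_authors` is an illustrative helper name.

```python
def split_authors(authors: str) -> list[str]:
    """Split a slash-delimited authors field into individual names."""
    return [name.strip() for name in authors.split("/") if name.strip()]


# Sample rows copied from the table above.
print(split_authors("J.K. Rowling/Mary GrandPré"))
print(split_authors("Sam Bass Warner Jr./Sam B. Warner"))
print(split_authors("James Wesley Rawles"))
```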
| Value | Count | Frequency (%) |
| john | 279 | 0.8% |
| william | 262 | 0.8% |
| james | 228 | 0.7% |
| david | 203 | 0.6% |
| a | 191 | 0.6% |
| robert | 185 | 0.5% |
| j | 181 | 0.5% |
| stephen | 176 | 0.5% |
| richard | 157 | 0.5% |
| m | 155 | 0.5% |
| Other values (12644) | 31795 | 94.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 23758 | 8.6% |
| (space) | 23425 | 8.5% |
| a | 22547 | 8.2% |
| r | 18119 | 6.6% |
| n | 17426 | 6.3% |
| i | 15677 | 5.7% |
| o | 14415 | 5.2% |
| l | 13273 | 4.8% |
| s | 9822 | 3.6% |
| t | 9527 | 3.5% |
| Other values (257) | 107115 | 38.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 195795 | 71.2% |
| Uppercase Letter | 43440 | 15.8% |
| Space Separator | 23425 | 8.5% |
| Other Punctuation | 11849 | 4.3% |
| Other Letter | 378 | 0.1% |
| Dash Punctuation | 200 | 0.1% |
| Close Punctuation | 5 | < 0.1% |
| Open Punctuation | 5 | < 0.1% |
| Decimal Number | 4 | < 0.1% |
| Format | 2 | < 0.1% |
Most frequent character per category
Other Letter
| Value | Count | Frequency (%) |
| ا | 25 | 6.6% |
| ر | 22 | 5.8% |
| ن | 19 | 5.0% |
| ل | 18 | 4.8% |
| ب | 15 | 4.0% |
| م | 13 | 3.4% |
| ج | 12 | 3.2% |
| 川 | 9 | 2.4% |
| ي | 9 | 2.4% |
| 方 | 8 | 2.1% |
| Other values (99) | 228 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 23758 | |
| a | 22547 | |
| r | 18119 | |
| n | 17426 | 8.9% |
| i | 15677 | 8.0% |
| o | 14415 | 7.4% |
| l | 13273 | 6.8% |
| s | 9822 | 5.0% |
| t | 9527 | 4.9% |
| h | 7628 | 3.9% |
| Other values (86) | 43603 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 3731 | 8.6% |
| S | 3497 | 8.1% |
| J | 3286 | 7.6% |
| C | 3074 | 7.1% |
| R | 2655 | 6.1% |
| A | 2611 | 6.0% |
| B | 2520 | 5.8% |
| D | 2490 | 5.7% |
| H | 2232 | 5.1% |
| P | 2200 | 5.1% |
| Other values (37) | 15144 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 8117 | |
| . | 3577 | |
| ' | 148 | 1.2% |
| ! | 4 | < 0.1% |
| " | 2 | < 0.1% |
| & | 1 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 2 | |
| 1 | 1 | |
| 2 | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 23425 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 200 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 5 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 5 |
Format
| Value | Count | Frequency (%) |
| | 2 |
Math Symbol
| Value | Count | Frequency (%) |
| = | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 239152 | 86.9% |
| Common | 35489 | 12.9% |
| Arabic | 187 | 0.1% |
| Han | 173 | 0.1% |
| Greek | 53 | < 0.1% |
| Cyrillic | 30 | < 0.1% |
| Hiragana | 18 | < 0.1% |
| Inherited | 2 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 23758 | 9.9% |
| a | 22547 | 9.4% |
| r | 18119 | 7.6% |
| n | 17426 | 7.3% |
| i | 15677 | 6.6% |
| o | 14415 | 6.0% |
| l | 13273 | 5.6% |
| s | 9822 | 4.1% |
| t | 9527 | 4.0% |
| h | 7628 | 3.2% |
| Other values (88) | 86960 |
Han
| Value | Count | Frequency (%) |
| 川 | 9 | 5.2% |
| 方 | 8 | 4.6% |
| 荒 | 8 | 4.6% |
| 二 | 8 | 4.6% |
| 郁 | 8 | 4.6% |
| 仁 | 8 | 4.6% |
| 弘 | 8 | 4.6% |
| 伊 | 8 | 4.6% |
| 藤 | 8 | 4.6% |
| 潤 | 8 | 4.6% |
| Other values (56) | 92 |
Arabic
| Value | Count | Frequency (%) |
| ا | 25 | |
| ر | 22 | |
| ن | 19 | |
| ل | 18 | |
| ب | 15 | 8.0% |
| م | 13 | 7.0% |
| ج | 12 | 6.4% |
| ي | 9 | 4.8% |
| ی | 7 | 3.7% |
| خ | 6 | 3.2% |
| Other values (19) | 41 |
Greek
| Value | Count | Frequency (%) |
| ο | 7 | |
| α | 5 | 9.4% |
| υ | 4 | 7.5% |
| ς | 4 | 7.5% |
| λ | 4 | 7.5% |
| ί | 3 | 5.7% |
| ι | 3 | 5.7% |
| κ | 2 | 3.8% |
| τ | 2 | 3.8% |
| π | 2 | 3.8% |
| Other values (17) | 17 |
Cyrillic
| Value | Count | Frequency (%) |
| а | 5 | |
| л | 4 | |
| и | 3 | 10.0% |
| н | 2 | 6.7% |
| о | 2 | 6.7% |
| в | 2 | 6.7% |
| А | 1 | 3.3% |
| В | 1 | 3.3% |
| ь | 1 | 3.3% |
| е | 1 | 3.3% |
| Other values (8) | 8 |
Common
| Value | Count | Frequency (%) |
| (space) | 23425 | 66.0% |
| / | 8117 | 22.9% |
| . | 3577 | 10.1% |
| - | 200 | 0.6% |
| ' | 148 | 0.4% |
| ) | 5 | < 0.1% |
| ( | 5 | < 0.1% |
| ! | 4 | < 0.1% |
| 9 | 2 | < 0.1% |
| " | 2 | < 0.1% |
| Other values (4) | 4 | < 0.1% |
Hiragana
| Value | Count | Frequency (%) |
| き | 2 | |
| た | 2 | |
| し | 2 | |
| か | 2 | |
| ぐ | 1 | 5.6% |
| つ | 1 | 5.6% |
| み | 1 | 5.6% |
| ず | 1 | 5.6% |
| あ | 1 | 5.6% |
| ゆ | 1 | 5.6% |
| Other values (4) | 4 |
Inherited
| Value | Count | Frequency (%) |
| | 2 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 274080 | 99.6% |
| None | 613 | 0.2% |
| Arabic | 187 | 0.1% |
| CJK | 173 | 0.1% |
| Cyrillic | 30 | < 0.1% |
| Hiragana | 18 | < 0.1% |
| Punctuation | 2 | < 0.1% |
| Latin Ext Additional | 1 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 23758 | 8.7% |
| (space) | 23425 | 8.5% |
| a | 22547 | 8.2% |
| r | 18119 | 6.6% |
| n | 17426 | 6.4% |
| i | 15677 | 5.7% |
| o | 14415 | 5.3% |
| l | 13273 | 4.8% |
| s | 9822 | 3.6% |
| t | 9527 | 3.5% |
| Other values (56) | 106091 |
None
| Value | Count | Frequency (%) |
| é | 115 | |
| í | 85 | |
| á | 57 | 9.3% |
| ō | 45 | 7.3% |
| ó | 34 | 5.5% |
| ë | 20 | 3.3% |
| è | 20 | 3.3% |
| ü | 19 | 3.1% |
| ł | 17 | 2.8% |
| ï | 15 | 2.4% |
| Other values (62) | 186 |
Arabic
| Value | Count | Frequency (%) |
| ا | 25 | |
| ر | 22 | |
| ن | 19 | |
| ل | 18 | |
| ب | 15 | 8.0% |
| م | 13 | 7.0% |
| ج | 12 | 6.4% |
| ي | 9 | 4.8% |
| ی | 7 | 3.7% |
| خ | 6 | 3.2% |
| Other values (19) | 41 |
CJK
| Value | Count | Frequency (%) |
| 川 | 9 | 5.2% |
| 方 | 8 | 4.6% |
| 荒 | 8 | 4.6% |
| 二 | 8 | 4.6% |
| 郁 | 8 | 4.6% |
| 仁 | 8 | 4.6% |
| 弘 | 8 | 4.6% |
| 伊 | 8 | 4.6% |
| 藤 | 8 | 4.6% |
| 潤 | 8 | 4.6% |
| Other values (56) | 92 |
Cyrillic
| Value | Count | Frequency (%) |
| а | 5 | |
| л | 4 | |
| и | 3 | 10.0% |
| н | 2 | 6.7% |
| о | 2 | 6.7% |
| в | 2 | 6.7% |
| А | 1 | 3.3% |
| В | 1 | 3.3% |
| ь | 1 | 3.3% |
| е | 1 | 3.3% |
| Other values (8) | 8 |
Hiragana
| Value | Count | Frequency (%) |
| き | 2 | |
| た | 2 | |
| し | 2 | |
| か | 2 | |
| ぐ | 1 | 5.6% |
| つ | 1 | 5.6% |
| み | 1 | 5.6% |
| ず | 1 | 5.6% |
| あ | 1 | 5.6% |
| ゆ | 1 | 5.6% |
| Other values (4) | 4 |
Punctuation
| Value | Count | Frequency (%) |
| | 2 |
Latin Ext Additional
| Value | Count | Frequency (%) |
| ệ | 1 |
average_rating
Real number (ℝ)
| Distinct | 209 |
|---|---|
| Distinct (%) | 1.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.9336308 |
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 26 |
| Zeros (%) | 0.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3.44 |
| Q1 | 3.77 |
| median | 3.96 |
| Q3 | 4.135 |
| 95-th percentile | 4.38 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0.365 |
Descriptive statistics
| Standard deviation | 0.35244503 |
|---|---|
| Coefficient of variation (CV) | 0.089597893 |
| Kurtosis | 36.721777 |
| Mean | 3.9336308 |
| Median Absolute Deviation (MAD) | 0.18 |
| Skewness | -3.6383114 |
| Sum | 43769.51 |
| Variance | 0.1242175 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 219 | 2.0% |
| 3.96 | 195 | 1.8% |
| 4.02 | 178 | 1.6% |
| 3.94 | 176 | 1.6% |
| 4.07 | 172 | 1.5% |
| 3.92 | 168 | 1.5% |
| 3.93 | 168 | 1.5% |
| 4.05 | 168 | 1.5% |
| 3.83 | 166 | 1.5% |
| 3.89 | 166 | 1.5% |
| Other values (199) | 9351 | 84.0% |
| Value | Count | Frequency (%) |
| 0 | 26 | 0.2% |
| 1 | 2 | < 0.1% |
| 1.67 | 1 | < 0.1% |
| 2 | 6 | 0.1% |
| 2.33 | 1 | < 0.1% |
| 2.4 | 1 | < 0.1% |
| 2.5 | 1 | < 0.1% |
| 2.55 | 1 | < 0.1% |
| 2.61 | 1 | < 0.1% |
| 2.62 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 5 | 22 | 0.2% |
| 4.91 | 1 | < 0.1% |
| 4.88 | 1 | < 0.1% |
| 4.86 | 1 | < 0.1% |
| 4.83 | 1 | < 0.1% |
| 4.82 | 1 | < 0.1% |
| 4.8 | 1 | < 0.1% |
| 4.78 | 2 | < 0.1% |
| 4.76 | 1 | < 0.1% |
| 4.75 | 2 | < 0.1% |
isbn
Text
UNIQUE 
| Distinct | 11127 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.1 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 9.2079626 |
| Min length | 7 |
Characters and Unicode
| Total characters | 102457 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 11127 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 851742718 |
|---|---|
| 2nd row | 1593600119 |
| 3rd row | 674842111 |
| 4th row | 156384155X |
| 5th row | 439785960 |
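The sample values above are shorter than ten characters, and one value elsewhere in this section uses a lowercase `x`. One plausible reading, which is an assumption here, is that leading zeros were lost when the column was parsed. The sketch below pads values back to ten characters, upper-cases the check character, and verifies the standard ISBN-10 checksum (weights 10 down to 1, total divisible by 11). Both helper names are illustrative.

```python
def normalize_isbn10(raw: str) -> str:
    """Upper-case the check character and restore dropped leading zeros (assumed)."""
    return raw.strip().upper().zfill(10)


def is_valid_isbn10(isbn: str) -> bool:
    """Return True if the weighted digit sum (weights 10..1) is divisible by 11."""
    if len(isbn) != 10:
        return False
    total = 0
    for position, char in enumerate(isbn):
        if char in "0123456789":
            value = int(char)
        elif char == "X" and position == 9:
            value = 10  # 'X' stands for 10 and is only legal as the check character
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


# Sample rows copied from the table above.
for raw in ["851742718", "156384155X", "439785960"]:
    isbn = normalize_isbn10(raw)
    print(raw, "->", isbn, is_valid_isbn10(isbn))
```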
| Value | Count | Frequency (%) |
| 851742718 | 1 | < 0.1% |
| 517226952 | 1 | < 0.1% |
| 674842111 | 1 | < 0.1% |
| 156384155x | 1 | < 0.1% |
| 439785960 | 1 | < 0.1% |
| 439358078 | 1 | < 0.1% |
| 439554896 | 1 | < 0.1% |
| 043965548x | 1 | < 0.1% |
| 439682584 | 1 | < 0.1% |
| 976540606 | 1 | < 0.1% |
| Other values (11117) | 11117 | 99.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 12578 | 12.3% |
| 4 | 11497 | 11.2% |
| 0 | 11182 | 10.9% |
| 5 | 10545 | 10.3% |
| 3 | 10381 | 10.1% |
| 2 | 9465 | 9.2% |
| 7 | 9347 | 9.1% |
| 8 | 9105 | 8.9% |
| 6 | 9079 | 8.9% |
| 9 | 8293 | 8.1% |
| Other values (2) | 985 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 101472 | 99.0% |
| Uppercase Letter | 984 | 1.0% |
| Lowercase Letter | 1 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 12578 | |
| 4 | 11497 | |
| 0 | 11182 | |
| 5 | 10545 | |
| 3 | 10381 | |
| 2 | 9465 | |
| 7 | 9347 | |
| 8 | 9105 | |
| 6 | 9079 | |
| 9 | 8293 |
Uppercase Letter
| Value | Count | Frequency (%) |
| X | 984 |
Lowercase Letter
| Value | Count | Frequency (%) |
| x | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 101472 | 99.0% |
| Latin | 985 | 1.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 12578 | |
| 4 | 11497 | |
| 0 | 11182 | |
| 5 | 10545 | |
| 3 | 10381 | |
| 2 | 9465 | |
| 7 | 9347 | |
| 8 | 9105 | |
| 6 | 9079 | |
| 9 | 8293 |
Latin
| Value | Count | Frequency (%) |
| X | 984 | |
| x | 1 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 102457 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 12578 | |
| 4 | 11497 | |
| 0 | 11182 | |
| 5 | 10545 | |
| 3 | 10381 | |
| 2 | 9465 | |
| 7 | 9347 | |
| 8 | 9105 | |
| 6 | 9079 | |
| 9 | 8293 | |
| Other values (2) | 985 | 1.0% |
isbn13
Real number (ℝ)
SKEWED 
| Distinct | 239 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.7598879 × 10¹² |
| Minimum | 8.9870598 × 10⁹ |
| Maximum | 9.79001 × 10¹² |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.1 KiB |
Quantile statistics
| Minimum | 8.9870598 × 10⁹ |
|---|---|
| 5-th percentile | 9.78006 × 10¹² |
| Q1 | 9.78035 × 10¹² |
| median | 9.78059 × 10¹² |
| Q3 | 9.78087 × 10¹² |
| 95-th percentile | 9.78193 × 10¹² |
| Maximum | 9.79001 × 10¹² |
| Range | 9.7810229 × 10¹² |
| Interquartile range (IQR) | 5.2 × 10⁸ |
Descriptive statistics
| Standard deviation | 4.428964 × 10¹¹ |
|---|---|
| Coefficient of variation (CV) | 0.04537925 |
| Kurtosis | 442.6346 |
| Mean | 9.7598879 × 10¹² |
| Median Absolute Deviation (MAD) | 2.5 × 10⁸ |
| Skewness | -21.070288 |
| Sum | 1.0859827 × 10¹⁷ |
| Variance | 1.9615722 × 10²³ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 9.78014 × 10¹² | 662 | 5.9% |
| 9.78006 × 10¹² | 654 | 5.9% |
| 9.78045 × 10¹² | 455 | 4.1% |
| 9.78039 × 10¹² | 389 | 3.5% |
| 9.78074 × 10¹² | 374 | 3.4% |
| 9.78031 × 10¹² | 349 | 3.1% |
| 9.78159 × 10¹² | 344 | 3.1% |
| 9.78081 × 10¹² | 342 | 3.1% |
| 9.78068 × 10¹² | 308 | 2.8% |
| 9.78038 × 10¹² | 300 | 2.7% |
| Other values (229) | 6950 | 62.5% |
| Value | Count | Frequency (%) |
| 8987059752 | 1 | < 0.1% |
| 2.004913 × 10¹⁰ | 1 | < 0.1% |
| 2.375500432 × 10¹⁰ | 1 | < 0.1% |
| 3.44060546 × 10¹⁰ | 1 | < 0.1% |
| 4.908600776 × 10¹⁰ | 1 | < 0.1% |
| 7.399914077 × 10¹⁰ | 1 | < 0.1% |
| 7.399925491 × 10¹⁰ | 1 | < 0.1% |
| 7.399976844 × 10¹⁰ | 1 | < 0.1% |
| 7.399996082 × 10¹⁰ | 1 | < 0.1% |
| 7.609202599 × 10¹⁰ | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 9.79001 × 10¹² | 1 | < 0.1% |
| 9.79 × 10¹² | 1 | < 0.1% |
| 9.78988 × 10¹² | 3 | < 0.1% |
| 9.78987 × 10¹² | 1 | < 0.1% |
| 9.78986 × 10¹² | 8 | 0.1% |
| 9.78983 × 10¹² | 1 | < 0.1% |
| 9.78981 × 10¹² | 3 | < 0.1% |
| 9.78979 × 10¹² | 1 | < 0.1% |
| 9.78977 × 10¹² | 1 | < 0.1% |
| 9.78972 × 10¹² | 6 | 0.1% |
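The strong negative skew of isbn13 appears to come from a handful of values with only 10 or 11 digits, far below the 13-digit values that dominate the column. A minimal sketch for flagging such rows is shown below. It relies on the fact that ISBN-13 values have 13 digits and begin with 978 or 979. The helper names are illustrative, and the second test value is the rounded most common value from the table, not an actual ISBN.

```python
def digit_count(value: int) -> int:
    """Number of decimal digits in a non-negative integer identifier."""
    return len(str(abs(value)))


def looks_like_isbn13(value: int) -> bool:
    """True if the value has 13 digits and a 978/979 prefix."""
    text = str(value)
    return digit_count(value) == 13 and text.startswith(("978", "979"))


smallest = 8987059752     # reported minimum of the column
typical = 9780140000000   # rounded most common value (9.78014 x 10^12), placeholder only

for value in (smallest, typical):
    print(value, digit_count(value), looks_like_isbn13(value))
```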
language_code
Categorical
IMBALANCE 
| Distinct | 27 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.1 KiB |
| eng | 8911 |
|---|---|
| en-US | 1409 |
| spa | 218 |
| en-GB | 214 |
| fre | 144 |
| Other values (22) | 231 |
Length
| Max length | 5 |
|---|---|
| Median length | 3 |
| Mean length | 3.2928912 |
| Min length | 2 |
Characters and Unicode
| Total characters | 36640 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 10 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | eng |
|---|---|
| 2nd row | eng |
| 3rd row | en-US |
| 4th row | eng |
| 5th row | eng |
Common Values
| Value | Count | Frequency (%) |
| eng | 8911 | 80.1% |
| en-US | 1409 | 12.7% |
| spa | 218 | 2.0% |
| en-GB | 214 | 1.9% |
| fre | 144 | 1.3% |
| ger | 99 | 0.9% |
| jpn | 46 | 0.4% |
| mul | 19 | 0.2% |
| zho | 14 | 0.1% |
| grc | 11 | 0.1% |
| Other values (17) | 42 | 0.4% |
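The table above mixes a three-letter code (`eng`) with regional tags (`en-US`, `en-GB`) for what is presumably the same language, which contributes to the imbalance alert. The sketch below groups the ten listed codes by base language and computes the combined English share. The alias mapping from `eng` to `en` is a hypothetical normalization chosen for illustration, and the 17 codes hidden under "Other values" are not included.

```python
from collections import Counter

TOTAL_ROWS = 11127

# Counts copied from the Common Values table above (top 10 codes only).
top_codes = {
    "eng": 8911, "en-US": 1409, "spa": 218, "en-GB": 214, "fre": 144,
    "ger": 99, "jpn": 46, "mul": 19, "zho": 14, "grc": 11,
}

# Hypothetical alias table: maps the three-letter code onto the two-letter tag.
ALIASES = {"eng": "en"}


def base_language(code: str) -> str:
    """Drop the region suffix ('en-US' -> 'en') and apply aliases."""
    primary = code.lower().split("-")[0]
    return ALIASES.get(primary, primary)


grouped = Counter()
for code, count in top_codes.items():
    grouped[base_language(code)] += count

english_share = grouped["en"] / TOTAL_ROWS
print(f"English variants: {grouped['en']} rows ({english_share:.1%})")
```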
Length
| Value | Count | Frequency (%) |
| eng | 8911 | 80.1% |
| en-us | 1409 | 12.7% |
| spa | 218 | 2.0% |
| en-gb | 214 | 1.9% |
| fre | 144 | 1.3% |
| ger | 99 | 0.9% |
| jpn | 46 | 0.4% |
| mul | 19 | 0.2% |
| zho | 14 | 0.1% |
| grc | 11 | 0.1% |
| Other values (17) | 42 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 10791 | 29.5% |
| n | 10592 | 28.9% |
| g | 9024 | 24.6% |
| - | 1630 | 4.4% |
| U | 1409 | 3.8% |
| S | 1409 | 3.8% |
| p | 275 | 0.8% |
| r | 270 | 0.7% |
| a | 231 | 0.6% |
| s | 224 | 0.6% |
| Other values (16) | 785 | 2.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 31750 | 86.7% |
| Uppercase Letter | 3260 | 8.9% |
| Dash Punctuation | 1630 | 4.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 10791 | |
| n | 10592 | |
| g | 9024 | |
| p | 275 | 0.9% |
| r | 270 | 0.9% |
| a | 231 | 0.7% |
| s | 224 | 0.7% |
| f | 144 | 0.5% |
| j | 46 | 0.1% |
| l | 27 | 0.1% |
| Other values (9) | 126 | 0.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 1409 | |
| S | 1409 | |
| G | 214 | 6.6% |
| B | 214 | 6.6% |
| C | 7 | 0.2% |
| A | 7 | 0.2% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1630 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 35010 | 95.6% |
| Common | 1630 | 4.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 10791 | |
| n | 10592 | |
| g | 9024 | |
| U | 1409 | 4.0% |
| S | 1409 | 4.0% |
| p | 275 | 0.8% |
| r | 270 | 0.8% |
| a | 231 | 0.7% |
| s | 224 | 0.6% |
| G | 214 | 0.6% |
| Other values (15) | 571 | 1.6% |
Common
| Value | Count | Frequency (%) |
| - | 1630 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 36640 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 10791 | |
| n | 10592 | |
| g | 9024 | |
| - | 1630 | 4.4% |
| U | 1409 | 3.8% |
| S | 1409 | 3.8% |
| p | 275 | 0.8% |
| r | 270 | 0.7% |
| a | 231 | 0.6% |
| s | 224 | 0.6% |
| Other values (16) | 785 | 2.1% |
num_pages
Real number (ℝ)
| Distinct | 997 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 336.37692 |
| Minimum | 0 |
| Maximum | 6576 |
| Zeros | 76 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 48 |
| Q1 | 192 |
| median | 299 |
| Q3 | 416 |
| 95-th percentile | 752 |
| Maximum | 6576 |
| Range | 6576 |
| Interquartile range (IQR) | 224 |
Descriptive statistics
| Standard deviation | 241.12731 |
|---|---|
| Coefficient of variation (CV) | 0.71683665 |
| Kurtosis | 62.422129 |
| Mean | 336.37692 |
| Median Absolute Deviation (MAD) | 107 |
| Skewness | 4.2717867 |
| Sum | 3742866 |
| Variance | 58142.377 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 288 | 230 | 2.1% |
| 192 | 221 | 2.0% |
| 320 | 218 | 2.0% |
| 256 | 207 | 1.9% |
| 352 | 202 | 1.8% |
| 224 | 198 | 1.8% |
| 208 | 178 | 1.6% |
| 304 | 177 | 1.6% |
| 240 | 173 | 1.6% |
| 384 | 172 | 1.5% |
| Other values (987) | 9151 | 82.2% |
| Value | Count | Frequency (%) |
| 0 | 76 | 0.7% |
| 1 | 11 | 0.1% |
| 2 | 15 | 0.1% |
| 3 | 19 | 0.2% |
| 4 | 11 | 0.1% |
| 5 | 16 | 0.1% |
| 6 | 20 | 0.2% |
| 7 | 6 | 0.1% |
| 8 | 10 | 0.1% |
| 9 | 11 | 0.1% |
| Value | Count | Frequency (%) |
| 6576 | 1 | < 0.1% |
| 4736 | 1 | < 0.1% |
| 3400 | 1 | < 0.1% |
| 3342 | 1 | < 0.1% |
| 3020 | 1 | < 0.1% |
| 2751 | 1 | < 0.1% |
| 2690 | 1 | < 0.1% |
| 2480 | 1 | < 0.1% |
| 2264 | 1 | < 0.1% |
| 2198 | 1 | < 0.1% |
ratings_count
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 5294 |
|---|---|
| Distinct (%) | 47.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17936.409 |
| Minimum | 0 |
| Maximum | 4597666 |
| Zeros | 81 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 8 |
| Q1 | 104 |
| median | 745 |
| Q3 | 4993.5 |
| 95-th percentile | 61096 |
| Maximum | 4597666 |
| Range | 4597666 |
| Interquartile range (IQR) | 4889.5 |
Descriptive statistics
| Standard deviation | 112479.44 |
|---|---|
| Coefficient of variation (CV) | 6.2710123 |
| Kurtosis | 442.42766 |
| Mean | 17936.409 |
| Median Absolute Deviation (MAD) | 727 |
| Skewness | 17.697061 |
| Sum | 1.9957842 × 10⁸ |
| Variance | 1.2651625 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3 | 82 | 0.7% |
| 0 | 81 | 0.7% |
| 1 | 76 | 0.7% |
| 4 | 71 | 0.6% |
| 2 | 71 | 0.6% |
| 5 | 61 | 0.5% |
| 9 | 60 | 0.5% |
| 8 | 59 | 0.5% |
| 6 | 57 | 0.5% |
| 7 | 56 | 0.5% |
| Other values (5284) | 10453 | 93.9% |
| Value | Count | Frequency (%) |
| 0 | 81 | 0.7% |
| 1 | 76 | 0.7% |
| 2 | 71 | 0.6% |
| 3 | 82 | 0.7% |
| 4 | 71 | 0.6% |
| 5 | 61 | 0.5% |
| 6 | 57 | 0.5% |
| 7 | 56 | 0.5% |
| 8 | 59 | 0.5% |
| 9 | 60 | 0.5% |
| Value | Count | Frequency (%) |
| 4597666 | 1 | < 0.1% |
| 2530894 | 1 | < 0.1% |
| 2457092 | 1 | < 0.1% |
| 2418736 | 1 | < 0.1% |
| 2339585 | 1 | < 0.1% |
| 2293963 | 1 | < 0.1% |
| 2153167 | 1 | < 0.1% |
| 2128944 | 1 | < 0.1% |
| 2111750 | 1 | < 0.1% |
| 2095690 | 1 | < 0.1% |
text_reviews_count
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 1822 |
|---|---|
| Distinct (%) | 16.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 541.8545 |
| Minimum | 0 |
| Maximum | 94265 |
| Zeros | 625 |
| Zeros (%) | 5.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 9 |
| median | 46 |
| Q3 | 237.5 |
| 95-th percentile | 2158.7 |
| Maximum | 94265 |
| Range | 94265 |
| Interquartile range (IQR) | 228.5 |
Descriptive statistics
| Standard deviation | 2576.1766 |
|---|---|
| Coefficient of variation (CV) | 4.7543697 |
| Kurtosis | 396.701 |
| Mean | 541.8545 |
| Median Absolute Deviation (MAD) | 44 |
| Skewness | 16.177845 |
| Sum | 6029215 |
| Variance | 6636685.9 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 625 | 5.6% |
| 1 | 458 | 4.1% |
| 2 | 354 | 3.2% |
| 3 | 263 | 2.4% |
| 4 | 249 | 2.2% |
| 5 | 223 | 2.0% |
| 6 | 200 | 1.8% |
| 7 | 180 | 1.6% |
| 9 | 164 | 1.5% |
| 8 | 162 | 1.5% |
| Other values (1812) | 8249 | 74.1% |
| Value | Count | Frequency (%) |
| 0 | 625 | 5.6% |
| 1 | 458 | 4.1% |
| 2 | 354 | 3.2% |
| 3 | 263 | 2.4% |
| 4 | 249 | 2.2% |
| 5 | 223 | 2.0% |
| 6 | 200 | 1.8% |
| 7 | 180 | 1.6% |
| 8 | 162 | 1.5% |
| 9 | 164 | 1.5% |
| Value | Count | Frequency (%) |
| 94265 | 1 | < 0.1% |
| 86881 | 1 | < 0.1% |
| 56604 | 1 | < 0.1% |
| 55843 | 1 | < 0.1% |
| 52759 | 1 | < 0.1% |
| 47951 | 1 | < 0.1% |
| 47620 | 1 | < 0.1% |
| 46176 | 1 | < 0.1% |
| 43499 | 1 | < 0.1% |
| 36325 | 1 | < 0.1% |
publication_date
Text
| Distinct | 3679 |
|---|---|
| Distinct (%) | 33.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.1 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.6969534 |
| Min length | 9 |
Characters and Unicode
| Total characters | 107898 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2022 |
|---|---|
| Unique (%) | 18.2% |
Sample
| 1st row | 05-01-1977 |
|---|---|
| 2nd row | 04-06-2004 |
| 3rd row | 4/20/2004 |
| 4th row | 1/15/1999 |
| 5th row | 9/16/2006 |
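The sample rows show two layouts in the same column: zero-padded with dashes (`05-01-1977`) and unpadded with slashes (`4/20/2004`). The sketch below normalises both to `datetime.date` values. It assumes both layouts are month-first, which the report itself does not confirm; `parse_publication_date` is a hypothetical helper name.

```python
from datetime import date, datetime

# Assumption: both layouts are month-first (MM-DD-YYYY and M/D/YYYY).
# If the dashed values are actually day-first, swap %m and %d in the first format.
FORMATS = ("%m-%d-%Y", "%m/%d/%Y")


def parse_publication_date(text: str) -> date:
    """Try each known layout in turn and return the first successful parse."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    # Values that fit neither layout (or are not real calendar dates) end up here.
    raise ValueError(f"unrecognised publication_date: {text!r}")


# The five sample rows shown in the table above.
samples = ["05-01-1977", "04-06-2004", "4/20/2004", "1/15/1999", "9/16/2006"]
parsed = [parse_publication_date(s) for s in samples]
for raw, value in zip(samples, parsed):
    print(raw, "->", value.isoformat())
```

Raising on unparseable input, rather than returning a placeholder, keeps malformed dates visible instead of silently turning them into missing values.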
| Value | Count | Frequency (%) |
| 10-01-2005 | 56 | 0.5% |
| 11-01-2005 | 53 | 0.5% |
| 09-01-2006 | 51 | 0.5% |
| 10-01-2006 | 48 | 0.4% |
| 11-01-2006 | 40 | 0.4% |
| 07-01-2004 | 39 | 0.4% |
| 08-01-2006 | 39 | 0.4% |
| 08-01-2005 | 37 | 0.3% |
| 07-01-2003 | 37 | 0.3% |
| 10-01-2004 | 37 | 0.3% |
| Other values (3669) | 10690 | 96.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 28827 | 26.7% |
| 1 | 15692 | 14.5% |
| 2 | 13218 | 12.3% |
| - | 13144 | 12.2% |
| / | 9110 | 8.4% |
| 9 | 8403 | 7.8% |
| 6 | 3780 | 3.5% |
| 5 | 3683 | 3.4% |
| 3 | 3359 | 3.1% |
| 4 | 3084 | 2.9% |
| Other values (2) | 5598 | 5.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 85644 | 79.4% |
| Dash Punctuation | 13144 | 12.2% |
| Other Punctuation | 9110 | 8.4% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 28827 | 33.7% |
| 1 | 15692 | 18.3% |
| 2 | 13218 | 15.4% |
| 9 | 8403 | 9.8% |
| 6 | 3780 | 4.4% |
| 5 | 3683 | 4.3% |
| 3 | 3359 | 3.9% |
| 4 | 3084 | 3.6% |
| 7 | 2826 | 3.3% |
| 8 | 2772 | 3.2% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 13144 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 9110 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 107898 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 28827 | 26.7% |
| 1 | 15692 | 14.5% |
| 2 | 13218 | 12.3% |
| - | 13144 | 12.2% |
| / | 9110 | 8.4% |
| 9 | 8403 | 7.8% |
| 6 | 3780 | 3.5% |
| 5 | 3683 | 3.4% |
| 3 | 3359 | 3.1% |
| 4 | 3084 | 2.9% |
| Other values (2) | 5598 | 5.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 107898 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 28827 | 26.7% |
| 1 | 15692 | 14.5% |
| 2 | 13218 | 12.3% |
| - | 13144 | 12.2% |
| / | 9110 | 8.4% |
| 9 | 8403 | 7.8% |
| 6 | 3780 | 3.5% |
| 5 | 3683 | 3.4% |
| 3 | 3359 | 3.1% |
| 4 | 3084 | 2.9% |
| Other values (2) | 5598 | 5.2% |
publisher
Text
| Distinct | 2292 |
|---|---|
| Distinct (%) | 20.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.1 KiB |
Length
| Max length | 67 |
|---|---|
| Median length | 52 |
| Mean length | 15.180372 |
| Min length | 2 |
Characters and Unicode
| Total characters | 168912 |
|---|---|
| Distinct characters | 140 |
| Distinct categories | 11 |
| Distinct scripts | 6 |
| Distinct blocks | 7 |
Unique
| Unique | 1296 |
|---|---|
| Unique (%) | 11.6% |
Sample
| 1st row | Brown Son & Ferguson Ltd. |
|---|---|
| 2nd row | Cold Spring Press |
| 3rd row | Harvard University Press |
| 4th row | Huntington House Publishers |
| 5th row | Scholastic Inc. |
| Value | Count | Frequency (%) |
| books | 2302 | 9.3% |
| press | 1316 | 5.3% |
| penguin | 598 | 2.4% |
| university | 552 | 2.2% |
| publishing | 511 | 2.1% |
| vintage | 409 | 1.6% |
| | 353 | 1.4% |
| classics | 344 | 1.4% |
| company | 331 | 1.3% |
| house | 320 | 1.3% |
| Other values (2016) | 17765 | 71.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 14220 | 8.4% |
| o | 12967 | 7.7% |
| e | 12949 | 7.7% |
| s | 11914 | 7.1% |
| r | 11785 | 7.0% |
| i | 10689 | 6.3% |
| a | 10476 | 6.2% |
| n | 10462 | 6.2% |
| l | 6209 | 3.7% |
| t | 5820 | 3.4% |
| Other values (130) | 61421 | 36.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 126142 | 74.7% |
| Uppercase Letter | 26043 | 15.4% |
| Space Separator | 14220 | 8.4% |
| Other Punctuation | 1640 | 1.0% |
| Other Letter | 235 | 0.1% |
| Open Punctuation | 234 | 0.1% |
| Close Punctuation | 234 | 0.1% |
| Dash Punctuation | 101 | 0.1% |
| Decimal Number | 56 | < 0.1% |
| Final Punctuation | 4 | < 0.1% |
Most frequent character per category
Other Letter
| Value | Count | Frequency (%) |
| 社 | 18 | 7.7% |
| 館 | 16 | 6.8% |
| 学 | 16 | 6.8% |
| 小 | 16 | 6.8% |
| 英 | 13 | 5.5% |
| 集 | 12 | 5.1% |
| ン | 10 | 4.3% |
| ガ | 10 | 4.3% |
| 東 | 8 | 3.4% |
| 立 | 8 | 3.4% |
| Other values (34) | 108 | 46.0% |
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 12967 | 10.3% |
| e | 12949 | 10.3% |
| s | 11914 | 9.4% |
| r | 11785 | 9.3% |
| i | 10689 | 8.5% |
| a | 10476 | 8.3% |
| n | 10462 | 8.3% |
| l | 6209 | 4.9% |
| t | 5820 | 4.6% |
| u | 4146 | 3.3% |
| Other values (31) | 28725 | 22.8% |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 4241 | 16.3% |
| B | 3806 | 14.6% |
| C | 2151 | 8.3% |
| S | 1724 | 6.6% |
| H | 1713 | 6.6% |
| A | 1365 | 5.2% |
| M | 1283 | 4.9% |
| L | 1078 | 4.1% |
| D | 931 | 3.6% |
| R | 851 | 3.3% |
| Other values (18) | 6900 | 26.5% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 14 | 25.0% |
| 3 | 13 | 23.2% |
| 0 | 11 | 19.6% |
| 2 | 5 | 8.9% |
| 8 | 5 | 8.9% |
| 4 | 4 | 7.1% |
| 7 | 2 | 3.6% |
| 9 | 1 | 1.8% |
| 6 | 1 | 1.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 833 | 50.8% |
| ' | 330 | 20.1% |
| & | 329 | 20.1% |
| / | 133 | 8.1% |
| : | 7 | 0.4% |
| ; | 4 | 0.2% |
| " | 2 | 0.1% |
| ! | 2 | 0.1% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ̄ | 1 | 33.3% |
| ̃ | 1 | 33.3% |
| ́ | 1 | 33.3% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 233 | 99.6% |
| [ | 1 | 0.4% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 233 | 99.6% |
| ] | 1 | 0.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 14220 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 101 | 100.0% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 4 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 152180 | 90.1% |
| Common | 16489 | 9.8% |
| Han | 186 | 0.1% |
| Katakana | 49 | < 0.1% |
| Cyrillic | 5 | < 0.1% |
| Inherited | 3 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 12967 | 8.5% |
| e | 12949 | 8.5% |
| s | 11914 | 7.8% |
| r | 11785 | 7.7% |
| i | 10689 | 7.0% |
| a | 10476 | 6.9% |
| n | 10462 | 6.9% |
| l | 6209 | 4.1% |
| t | 5820 | 3.8% |
| P | 4241 | 2.8% |
| Other values (54) | 54668 | 35.9% |
Han
| Value | Count | Frequency (%) |
| 社 | 18 | 9.7% |
| 館 | 16 | 8.6% |
| 学 | 16 | 8.6% |
| 小 | 16 | 8.6% |
| 英 | 13 | 7.0% |
| 集 | 12 | 6.5% |
| 東 | 8 | 4.3% |
| 立 | 8 | 4.3% |
| 講 | 6 | 3.2% |
| 版 | 6 | 3.2% |
| Other values (24) | 67 | 36.0% |
Common
| Value | Count | Frequency (%) |
| (space) | 14220 | 86.2% |
| . | 833 | 5.1% |
| ' | 330 | 2.0% |
| & | 329 | 2.0% |
| ( | 233 | 1.4% |
| ) | 233 | 1.4% |
| / | 133 | 0.8% |
| - | 101 | 0.6% |
| 1 | 14 | 0.1% |
| 3 | 13 | 0.1% |
| Other values (14) | 50 | 0.3% |
Katakana
| Value | Count | Frequency (%) |
| ン | 10 | 20.4% |
| ガ | 10 | 20.4% |
| ス | 6 | 12.2% |
| コ | 5 | 10.2% |
| ク | 5 | 10.2% |
| ミ | 5 | 10.2% |
| ッ | 5 | 10.2% |
| ロ | 1 | 2.0% |
| ブ | 1 | 2.0% |
| ビ | 1 | 2.0% |
Cyrillic
| Value | Count | Frequency (%) |
| с | 1 | 20.0% |
| о | 1 | 20.0% |
| м | 1 | 20.0% |
| Э | 1 | 20.0% |
| к | 1 | 20.0% |
Inherited
| Value | Count | Frequency (%) |
| ̄ | 1 | 33.3% |
| ̃ | 1 | 33.3% |
| ́ | 1 | 33.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 168602 | 99.8% |
| CJK | 186 | 0.1% |
| None | 63 | < 0.1% |
| Katakana | 49 | < 0.1% |
| Cyrillic | 5 | < 0.1% |
| Punctuation | 4 | < 0.1% |
| Diacriticals | 3 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 14220 | 8.4% |
| o | 12967 | 7.7% |
| e | 12949 | 7.7% |
| s | 11914 | 7.1% |
| r | 11785 | 7.0% |
| i | 10689 | 6.3% |
| a | 10476 | 6.2% |
| n | 10462 | 6.2% |
| l | 6209 | 3.7% |
| t | 5820 | 3.5% |
| Other values (65) | 61111 | 36.2% |
None
| Value | Count | Frequency (%) |
| é | 27 | 42.9% |
| ü | 8 | 12.7% |
| ç | 6 | 9.5% |
| ë | 5 | 7.9% |
| É | 4 | 6.3% |
| ı | 3 | 4.8% |
| í | 3 | 4.8% |
| ó | 2 | 3.2% |
| ö | 2 | 3.2% |
| ñ | 1 | 1.6% |
| Other values (2) | 2 | 3.2% |
CJK
| Value | Count | Frequency (%) |
| 社 | 18 | 9.7% |
| 館 | 16 | 8.6% |
| 学 | 16 | 8.6% |
| 小 | 16 | 8.6% |
| 英 | 13 | 7.0% |
| 集 | 12 | 6.5% |
| 東 | 8 | 4.3% |
| 立 | 8 | 4.3% |
| 講 | 6 | 3.2% |
| 版 | 6 | 3.2% |
| Other values (24) | 67 | 36.0% |
Katakana
| Value | Count | Frequency (%) |
| ン | 10 | 20.4% |
| ガ | 10 | 20.4% |
| ス | 6 | 12.2% |
| コ | 5 | 10.2% |
| ク | 5 | 10.2% |
| ミ | 5 | 10.2% |
| ッ | 5 | 10.2% |
| ロ | 1 | 2.0% |
| ブ | 1 | 2.0% |
| ビ | 1 | 2.0% |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 4 | 100.0% |
Cyrillic
| Value | Count | Frequency (%) |
| с | 1 | 20.0% |
| о | 1 | 20.0% |
| м | 1 | 20.0% |
| Э | 1 | 20.0% |
| к | 1 | 20.0% |
Diacriticals
| Value | Count | Frequency (%) |
| ̄ | 1 | 33.3% |
| ̃ | 1 | 33.3% |
| ́ | 1 | 33.3% |
| | average_rating | bookID | isbn13 | language_code | num_pages | ratings_count | text_reviews_count |
|---|---|---|---|---|---|---|---|
| average_rating | 1.000 | -0.037 | 0.054 | 0.101 | 0.110 | 0.087 | 0.033 |
| bookID | -0.037 | 1.000 | 0.041 | 0.050 | -0.010 | -0.099 | -0.112 |
| isbn13 | 0.054 | 0.041 | 1.000 | 0.000 | -0.137 | -0.252 | -0.264 |
| language_code | 0.101 | 0.050 | 0.000 | 1.000 | 0.016 | -0.048 | -0.054 |
| num_pages | 0.110 | -0.010 | -0.137 | 0.016 | 1.000 | 0.185 | 0.168 |
| ratings_count | 0.087 | -0.099 | -0.252 | -0.048 | 0.185 | 1.000 | 0.959 |
| text_reviews_count | 0.033 | -0.112 | -0.264 | -0.054 | 0.168 | 0.959 | 1.000 |
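The matrix above can be scanned for strongly related pairs, which is how a "High correlation" alert can be traced back to its source. The sketch below copies the coefficients from the table and lists every pair whose absolute value meets a cutoff. The cutoff of 0.5 is an assumption chosen for illustration, not a setting read from this report's configuration.

```python
# Variable names and coefficients copied from the matrix above.
names = [
    "average_rating", "bookID", "isbn13", "language_code",
    "num_pages", "ratings_count", "text_reviews_count",
]
matrix = [
    [1.000, -0.037, 0.054, 0.101, 0.110, 0.087, 0.033],
    [-0.037, 1.000, 0.041, 0.050, -0.010, -0.099, -0.112],
    [0.054, 0.041, 1.000, 0.000, -0.137, -0.252, -0.264],
    [0.101, 0.050, 0.000, 1.000, 0.016, -0.048, -0.054],
    [0.110, -0.010, -0.137, 0.016, 1.000, 0.185, 0.168],
    [0.087, -0.099, -0.252, -0.048, 0.185, 1.000, 0.959],
    [0.033, -0.112, -0.264, -0.054, 0.168, 0.959, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, for illustration only

# Walk the upper triangle so each pair is reported once and the diagonal is skipped.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:+.3f}")
```

With these values only `ratings_count` and `text_reviews_count` (0.959) clear the cutoff; the next largest magnitude in the matrix is 0.264.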
| | bookID | title | authors | average_rating | isbn | isbn13 | language_code | num_pages | ratings_count | text_reviews_count | publication_date | publisher |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 34889 | Brown's Star Atlas: Showing All The Bright Stars With Full Instructions How To Find And Use Them For Navigational Purposes And Department Of Trade Examinations. | Brown Son & Ferguson | 0.00 | 851742718 | 9,780,850,000,000.00 | eng | 49 | 0 | 0 | 05-01-1977 | Brown Son & Ferguson Ltd. |
| 1 | 16914 | The Tolkien Fan's Medieval Reader | David E. Smith (Turgon of TheOneRing.net one of the founding members of this Tolkien website)/Verlyn Flieger/Turgon (=David E. Smith) | 3.58 | 1593600119 | 9,781,590,000,000.00 | eng | 400 | 26 | 4 | 04-06-2004 | Cold Spring Press |
| 2 | 12224 | Streetcar Suburbs: The Process of Growth in Boston 1870-1900 | Sam Bass Warner Jr./Sam B. Warner | 3.58 | 674842111 | 9,780,670,000,000.00 | en-US | 236 | 61 | 6 | 4/20/2004 | Harvard University Press |
| 3 | 22128 | Patriots (The Coming Collapse) | James Wesley Rawles | 3.63 | 156384155X | 9,781,560,000,000.00 | eng | 342 | 38 | 4 | 1/15/1999 | Huntington House Publishers |
| 4 | 1 | Harry Potter and the Half-Blood Prince (Harry Potter #6) | J.K. Rowling/Mary GrandPré | 4.57 | 439785960 | 9,780,440,000,000.00 | eng | 652 | 2095690 | 27591 | 9/16/2006 | Scholastic Inc. |
| 5 | 2 | Harry Potter and the Order of the Phoenix (Harry Potter #5) | J.K. Rowling/Mary GrandPré | 4.49 | 439358078 | 9,780,440,000,000.00 | eng | 870 | 2153167 | 29221 | 09-01-2004 | Scholastic Inc. |
| 6 | 4 | Harry Potter and the Chamber of Secrets (Harry Potter #2) | J.K. Rowling | 4.42 | 439554896 | 9,780,440,000,000.00 | eng | 352 | 6333 | 244 | 11-01-2003 | Scholastic |
| 7 | 5 | Harry Potter and the Prisoner of Azkaban (Harry Potter #3) | J.K. Rowling/Mary GrandPré | 4.56 | 043965548X | 9,780,440,000,000.00 | eng | 435 | 2339585 | 36325 | 05-01-2004 | Scholastic Inc. |
| 8 | 8 | Harry Potter Boxed Set Books 1-5 (Harry Potter #1-5) | J.K. Rowling/Mary GrandPré | 4.78 | 439682584 | 9,780,440,000,000.00 | eng | 2690 | 41428 | 164 | 9/13/2004 | Scholastic |
| 9 | 9 | Unauthorized Harry Potter Book Seven News: "Half-Blood Prince" Analysis and Speculation | W. Frederick Zimmerman | 3.74 | 976540606 | 9,780,980,000,000.00 | en-US | 152 | 19 | 1 | 4/26/2005 | Nimble Books |
| | bookID | title | authors | average_rating | isbn | isbn13 | language_code | num_pages | ratings_count | text_reviews_count | publication_date | publisher |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11117 | 45617 | O Cavalo e o Seu Rapaz (As Crónicas de Nárnia #3) | C.S. Lewis/Pauline Baynes/Ana Falcão Bastos | 3.92 | 9722330551 | 9,789,720,000,000.00 | por | 160 | 207 | 16 | 8/15/2003 | Editorial Presença |
| 11118 | 45623 | O Sobrinho do Mágico (As Crónicas de Nárnia #1) | C.S. Lewis/Pauline Baynes/Ana Falcão Bastos | 4.04 | 9722329987 | 9,789,720,000,000.00 | por | 147 | 396 | 37 | 04-08-2003 | Editorial Presença |
| 11119 | 45625 | A Viagem do Caminheiro da Alvorada (As Crónicas de Nárnia #5) | C.S. Lewis/Pauline Baynes/Ana Falcão Bastos | 4.09 | 9722331329 | 9,789,720,000,000.00 | por | 176 | 161 | 14 | 09-01-2004 | Editorial Presença |
| 11120 | 45626 | O Príncipe Caspian (As Crónicas de Nárnia #4) | C.S. Lewis/Pauline Baynes/Ana Falcão Bastos | 3.97 | 9722330977 | 9,789,720,000,000.00 | por | 160 | 215 | 11 | 10-11-2003 | Editorial Presença |
| 11121 | 45630 | Whores for Gloria | William T. Vollmann | 3.69 | 140231579 | 9,780,140,000,000.00 | en-US | 160 | 932 | 111 | 02-01-1994 | Penguin Books |
| 11122 | 45631 | Expelled from Eden: A William T. Vollmann Reader | William T. Vollmann/Larry McCaffery/Michael Hemmingson | 4.06 | 1560254416 | 9,781,560,000,000.00 | eng | 512 | 156 | 20 | 12/21/2004 | Da Capo Press |
| 11123 | 45633 | You Bright and Risen Angels | William T. Vollmann | 4.08 | 140110879 | 9,780,140,000,000.00 | eng | 635 | 783 | 56 | 12-01-1988 | Penguin Books |
| 11124 | 45634 | The Ice-Shirt (Seven Dreams #1) | William T. Vollmann | 3.96 | 140131965 | 9,780,140,000,000.00 | eng | 415 | 820 | 95 | 08-01-1993 | Penguin Books |
| 11125 | 45639 | Poor People | William T. Vollmann | 3.72 | 60878827 | 9,780,060,000,000.00 | eng | 434 | 769 | 139 | 2/27/2007 | Ecco |
| 11126 | 45641 | Las aventuras de Tom Sawyer | Mark Twain | 3.91 | 8497646983 | 9,788,500,000,000.00 | spa | 272 | 113 | 12 | 5/28/2006 | Edimat Libros |